Abstract

We have analyzed the equilibrium response of chain molecules to stretching. For a
homogeneous sequence of monomers, the induced transition from compact globule to
extended coil below the $\theta$-temperature is predicted to be sharp. For random
sequences, however, the transition may be smoothed by a prevalence of necklace-like
structures, in which globular regions and coil regions coexist in a single chain. As we
show in the context of a random copolymer, preferential solvation of one monomer type
lends stability to such structures. The range of stretching forces over which necklaces
are stable is sensitive to chain length as well as sequence statistics.

Experiments probing the mechanical response of individual protein molecules have
demonstrated that certain domains can withstand significant stretching forces before un- 
folding ||l|-§[ . At strain rates much greater than unperturbed rates of unfolding, measured 
restoring forces imply that the native structures remain largely intact up to a threshold 
force, at which they unfold to an extensible state. Threshold forces for these domains are 
typically in the range $\sim 10$-$100\ T/a$, where $T$ is temperature (in units such that Boltzmann's
constant is unity) and $a$ is an average distance between neighboring, connected monomers
in the chain. To be sure, these experimental systems are out of equilibrium, as highlighted 
by wide hysteresis upon releasing strain and broad distributions of threshold forces. Nev- 
ertheless, it is reasonable to expect that such proteins undergo sharp unfolding transitions 
under adiabatic stretching conditions as well. 

The microscopic origin of this mechanical strength is not well understood. As Wolynes 
and coworkers have pointed out, small forces applied to the end-to-end distance of a protein 
may couple very weakly to the reaction coordinate for folding ||^. Atomically detailed simu- 
lations of rapid stretching instead suggest an important role for certain backbone topologies 
that are stabilized by groups of hydrogen bonds |^. In this Letter we explore the equilibrium 
force-induced transition at a coarse-grained level, using simple estimates of the relevant free 
energetics. Specifically, we distinguish between aspects of the transition that are homopoly- 
meric in nature, and those that arise from the quenched disorder characterizing random 
heteropolymers. We show that necklace-like structures, as depicted in Fig. 1(c), occur with
low probability in long homopolymers, but may be stabilized over a finite range of force 
and temperature by sequence heterogeneity. We describe the features of sequence statistics 
that affect this stability, and thereby determine mechanical strength. Our results may thus 
shed light on evolutionary design principles for proteins whose functions are mechanical in 
nature. 

We consider first a chain of $N$ identical monomers in solution. In the absence of stretching
force and below the $\theta$-temperature ($\theta$), the chain adopts a compact, globular conformation
(Fig. 1(a)). Because this state exhibits only small fluctuations in monomer density, its free



energy is well approximated using mean field theory ^: 

$$F_g \simeq BN + \gamma N^{2/3} - TS_0. \qquad (1)$$

Here, $B$ is the excess free energy per particle of a fluid of unconnected monomers at the
same temperature and density as the globule. The second term in Eq. (1) includes surface
energetics of the globule-solvent interface, as well as the conformational entropy $S_{\mathrm{sph}}$ lost
upon confining the chain to a spherical volume with radius $R \sim N^{1/3}$. $S_0$ is the entropy of
an ideal, unconfined chain.

We imagine that the principal effect of a small stretching force, $f$, on the ends of a
homopolymeric globule is to deform its spherical geometry (Fig. 1(b)). The favorable
energy of this deformation, $-f(R_\parallel - R)$, is offset by surface energetics as well as a loss
of entropy, $S_d$. We estimate this entropy loss through the statistics of a Gaussian chain
confined to a deformed volume. In the long-chain limit, the free energy per monomer of 
such a chain is isomorphic with the ground state energy of a quantal particle confined to 
the same volume [^. Treating the asymmetric boundary condition as a perturbation [10],
we sum an infinite class of terms in the ground-state expansion [11], obtaining



Because $S_{\mathrm{sph}} \sim N^{2/3}$, the energy gained by a reasonable deformation of the globule ($R_\parallel \sim
N^{1/3}$) is comparable to $S_d$ only for forces of magnitude $N^{1/3}$ or larger. For long chains, the
deformation achieved by stretching forces $\sim T/a$ is therefore negligible. We subsequently
consider the globule to be undeformed, and the globular free energy to be unaffected by 
stretching. 

For sufficiently large stretching forces, the dominant state of a homopolymer is an
expanded coil (Fig. 1(d)). In contrast to the globule, this state is characterized by
extensive density fluctuations and is strongly susceptible to deformation. Considering
short-ranged attractions between monomers to be unimportant in this case, we model the
extended coil as a freely jointed chain. The free energy of such a chain [§],
$F_c(N) = -NT \ln[\sinh(fa/T)/(fa/T)] - TS_0$, is sensitive to the magnitude of $f$. At a given
temperature $T < \theta$, a phase transition occurs at a force sufficient to lower the coil free energy
below that of the globule. (See inset of Fig. 2.) For $N^{1/3} \gg 1$, the phase boundary is given
by

$$B + \gamma N^{-1/3} = -T \ln[\sinh(fa/T)/(fa/T)]. \qquad (3)$$

The resulting phase diagram in the force-temperature plane is shown in Fig. 2 for various
$N$. Here, we have taken $B = T - \theta$ (accurate near $T = \theta$), which is unrealistic at low
temperatures. It is reasonable, however, that as temperature decreases, stretching energy
density of the strongly fluctuating coil grows more quickly than attractive energy density of
the relatively placid globule. As a result, a reentrant coil phase appears at low temperature.
The "coil" in this case is a nearly straight chain with small fluctuations in extension.
Computer simulations of strained lattice heteropolymers [12] appear to support our prediction
of reentrance.
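
The phase boundary condition, $B + \gamma N^{-1/3} = -T \ln[\sinh(fa/T)/(fa/T)]$ with $B = T - \theta$, can be solved numerically for the force at each temperature. The following is a minimal sketch under stated assumptions, not the procedure used to produce the figures: the function names, the bisection solver, and the default values $\theta = \gamma = a = 1$ are illustrative choices.

```python
import math


def coil_free_energy_per_monomer(f, T, a=1.0):
    """Freely jointed chain: -T ln[sinh(x)/x] with x = f a / T (k_B = 1)."""
    x = f * a / T
    if x < 1e-6:
        # Small-x expansion: ln(sinh(x)/x) ~ x^2 / 6
        return -T * x * x / 6.0
    # ln(sinh(x)/x) = x + ln(1 - e^{-2x}) - ln(2x), stable for large x
    log_ratio = x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0 * x)
    return -T * log_ratio


def boundary_force(T, N, theta=1.0, gamma=1.0, a=1.0):
    """Force at which coil and globule free energies per monomer coincide.

    Assumes B = T - theta (hypothetical default theta = 1). Returns 0.0 when
    the globule side is non-negative, i.e. the coil is stable without force.
    """
    globule = (T - theta) + gamma * N ** (-1.0 / 3.0)
    if globule >= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    # The coil free energy decreases monotonically with f; bracket the root.
    while coil_free_energy_per_monomer(hi, T, a) > globule:
        hi *= 2.0
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if coil_free_energy_per_monomer(mid, T, a) > globule:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for T in (0.05, 0.2, 0.5, 0.8):
        print(T, boundary_force(T, 10**6))
```

Scanning $T$ at fixed $N$ traces one boundary curve; within this sketch, the boundary force falls again as $T \to 0$, which is the reentrant behavior described above.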

Although the globule-coil transition is second-order in the absence of stretching [^], the 
force-induced transition is here predicted to be first-order, since the average chain extension 
is discontinuous at the phase boundary. This result is certainly correct for $T < \theta$, where
globule and coil phases are distinct. Near $T = \theta$ and $f = 0$, however, our caricatures of
these states are oversimplified: density fluctuations are not negligible in the globule, and
attractions are not negligible in the coil. In this region of the phase diagram, extension 
increases smoothly with force, and the stretching transition is second-order. 

In constructing a phase diagram for homopolymer stretching, we did not consider chain 
structures that are necklace-like (i.e., coexisting globule and coil regions, as in Fig. 1(c)).
Neglecting surface effects, the free energy per monomer of a necklace lies between that of 
a globule and that of a coil. The entropy of a necklace is augmented by the freedom of 
globular regions to reside anywhere along the chain (provided they do not overlap). For a 
single globular region, however, this additional "translational" entropy scales as $\ln N$, and
is insufficient to overcome the $O(N)$ deficit in free energy relative to either globule or coil, even for
chains of modest length. In the case of many globular regions, the gained entropy, $\sim N \ln 2$,
is considerable, but does not compensate for the cost of presenting an extensive surface to
the solvent.

The stability and importance of necklace structures may be qualitatively different for 
heteropolymers. In particular, fluctuations in local sequence composition yield a preference 
for globule or coil that varies along the chain. We assess the strength of this effect and 
its influence on stretching behavior, for a two-letter random copolymer. In this model, 
each monomer has two possible identities, $\sigma_i = \pm 1$, perhaps denoting hydrophobic and
hydrophilic moieties. For a given sequence $\{\sigma_i\}$, the energy of a chain conformation defined
by monomer positions $\mathbf{r}_i$ is

$$H = H_0 + \Gamma \sum_{i\ \mathrm{exposed}} \sigma_i - \mathbf{f}\cdot(\mathbf{r}_N - \mathbf{r}_1), \qquad (4)$$

$$H_0 = \sum_{i<j}^{N} \delta(\mathbf{r}_i - \mathbf{r}_j)\,(B_0 + \chi\sigma_i\sigma_j), \qquad (5)$$

where $B_0$ is a mean attractive energy density stabilizing the globule at low temperatures.
We consider $\chi < 0$, so that attractions are strongest between monomers of the same type.
The second sum in Eq. (4) includes only monomers that are exposed to solvent. For $\Gamma > 0$,
monomers of type $\sigma_i = -1$ are favorably solvated, while exposure of $\sigma_i = 1$ monomers
is unfavorable. In Ref. |T^, the phase diagram was determined for the model defined by
$H_0$ at the mean field level, using the replica trick to average over random sequences of
monomer types 0Jl^ • The one-step replica symmetry breaking demonstrated in that work 



is consistent with a suitably chosen random energy model. In other words, the average 
thermodynamics are reproduced by drawing $(a^3/v)^N$ energy levels at random from a
distribution $P(E) = (\pi N\Delta^2)^{-1/2}\exp[-(E - \bar{E})^2/N\Delta^2]$, with width $\Delta = |\chi|\mu^2\rho$ and mean
$\bar{E} = B_0 N$. Here, $v$ is the volume occupied by a monomer, $\rho \sim v^{-1}$ is the mean density in
the globule core, and $\mu$ is the width of the monomer distribution. Below we describe the
effect of solvation and stretching energetics on the effective distribution of energies [11].



For a compact, spherical globule with surface area $A$, the solvation contribution in Eq. (4)
broadens the energy distribution, increasing $\Delta$ by $(\Gamma^2\rho/4|\chi|)A/N$. This result is obtained
following the analysis of Ref. |T^, with the additional assumption that the spatial pattern
of solvent-exposed monomers is independent of compact chain conformation. As in the case
of the homopolymer, we neglect the energetic contribution of globule deformation. The
resulting distribution of energies is dominated by states in the interval
$\bar{E} - N^{1/2}\Delta < E < \bar{E} + N^{1/2}\Delta$. At energies just below a critical value,
$E^* = \bar{E} - N\Delta(\ln a^3/v)^{1/2}$, the number of
states is $O(1)$, while just above $E^*$ the number is exponentially large. The ground states of
particular random sequences are distributed narrowly about $E^*$ [Q. Introducing solvation
energetics thus lowers the average ground state energy by an amount $(\Gamma^2\rho/4|\chi|)(\ln a^3/v)^{1/2}A$.
If $\chi$ and $\Gamma$ are comparable in magnitude, this shift is a significant fraction of the energy gained
by exposing only monomers with $\sigma_i = -1$ to the solvent. The density of states around $E^*$
in this case is sufficiently large that a ground state with favorable solvation energetics may 
always be selected. 
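
The critical energy follows from asking where the expected number of levels, $(a^3/v)^N \exp[-(E - \bar{E})^2/N\Delta^2]$, drops to $O(1)$, which gives $E^* = \bar{E} - N\Delta(\ln a^3/v)^{1/2}$. The sketch below checks this condition and the solvation-induced shift numerically. The function names and all parameter values are illustrative assumptions, and the Gaussian prefactor is dropped because it is subexponential in $N$.

```python
import math


def log_number_of_states(E, N, E_mean, Delta, levels_per_monomer):
    """ln of the expected number of levels near E (Gaussian prefactor omitted).

    levels_per_monomer plays the role of a^3 / v and must exceed 1.
    """
    return N * math.log(levels_per_monomer) - (E - E_mean) ** 2 / (N * Delta ** 2)


def critical_energy(N, E_mean, Delta, levels_per_monomer):
    """E* = E_mean - N * Delta * sqrt(ln(a^3 / v))."""
    return E_mean - N * Delta * math.sqrt(math.log(levels_per_monomer))


def solvation_shift(Gamma, chi, rho, area, levels_per_monomer):
    """Change in E* when Delta grows by (Gamma^2 rho / 4|chi|) * A / N."""
    prefactor = Gamma ** 2 * rho / (4.0 * abs(chi))
    return -prefactor * math.sqrt(math.log(levels_per_monomer)) * area


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    N, E_mean, Delta, levels = 100, -50.0, 0.3, 4.0
    E_star = critical_energy(N, E_mean, Delta, levels)
    print(E_star, log_number_of_states(E_star, N, E_mean, Delta, levels))
```

Because $E^*$ is linear in $\Delta$, the shift is independent of the unperturbed width: only the increment in $\Delta$ and the surface area enter.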

In an extended coil, nearly all monomers are exposed to solvent. For sequences with fixed 
total composition, $\sum_{i=1}^{N} \sigma_i = 0$, thermodynamics of the coil state are unaffected by solvation
energetics. We consider the heteropolymer coil to have essentially the same physics as the 
homopolymer coil described above. 

For necklace structures of a random heteropolymer, free energy depends on the positions 
of globular regions along the chain. In effect, these globules move in a random potential
generated by sequence disorder. (See Fig. 3.) The scale and correlation length of this potential
are determined by the size of the globule, and by statistics of the sequence. Although the
total composition of the chain is fixed, a globular region with $n < N$ monomers, situated at
a given point on the chain, has in general an excess of one monomer type:

$$q = \frac{1}{n} \sum_{i \in \mathrm{globule}} \sigma_i. \qquad (6)$$

Because the local composition $q$ is a sum of many independent random variables, its
distribution is Gaussian, with width $\mu n^{-1/2}$. This distribution of sequence compositions leads
to localization of globules along the chain.
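
The statistics of the local composition can be illustrated by sliding a window of $n$ monomers along a random sequence with zero total composition and recording the windowed average of the monomer identities. This is a minimal sketch assuming equiprobable $\pm 1$ monomers (so that the single-monomer width is 1); the function names, the seed, and the values of `N` and `n` are illustrative.

```python
import random
import statistics


def balanced_sequence(N, rng):
    """Random +/-1 sequence with fixed total composition (sum equals 0 for even N)."""
    seq = [1] * (N // 2) + [-1] * (N - N // 2)
    rng.shuffle(seq)
    return seq


def window_compositions(seq, n):
    """Local composition q(x) = (1/n) * sum of seq[x : x + n] for every start x."""
    total = sum(seq[:n])
    qs = [total / n]
    for x in range(1, len(seq) - n + 1):
        # Slide the window by one monomer: add the new end, drop the old start.
        total += seq[x + n - 1] - seq[x - 1]
        qs.append(total / n)
    return qs


rng = random.Random(0)
N, n = 100_000, 100
seq = balanced_sequence(N, rng)
qs = window_compositions(seq, n)

# The measured spread of q should be close to n**-0.5 when n << N.
print(statistics.pstdev(qs), n ** -0.5)
```

Neighboring window positions overlap, so only about $N/n$ of the sampled values are statistically independent.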
At a particular location, a globular region has an apparent distribution of monomer types
that is modified according to the local value of $q$, $p(\sigma_i; q) \propto \exp[-(\sigma_i - q)^2/2\mu^2(1 - q^2)]$.
The ground state energy of the globule at this location, determined from the random energy
model described above, is

$$E^* = (B_0 - |\chi| q^2)\,n - \Gamma q \rho n^{2/3} - \left(\ln\frac{a^3}{v}\right)^{1/2}\left[2|\chi|\mu^2(1 - q^2)\rho n + \frac{\Gamma^2\rho}{4|\chi|}\,n^{2/3}\right]. \qquad (7)$$

Since local composition is not fixed for a heteropolymer, $E^*$ is a random variable for different
globule locations. But its fluctuations are weak: $\langle(\delta E^*)^2\rangle^{1/2} = \Gamma\rho\mu n^{1/6} + O(n^0)$, where
angled brackets denote an average over the distribution of $q$.

For a single globular region incorporating $n$ monomers, the remaining $N - n$ monomers
of the necklace belong to coil regions. When the globule resides at a given chain location
with composition $q$, the complementary composition in coil regions is $-q$. Consequently,
the solvation energy of these fully exposed regions, $-\Gamma n q$, is also a random variable, with
fluctuations of magnitude $\Gamma\mu n^{1/2}$. These relatively strong fluctuations in coil solvation
energy establish the scale of the random potential for globule motion along the chain. Below
a critical temperature $T_c$, the globule will become localized in the deepest minimum of this
potential. Following Derrida's analysis of randomly distributed energies |T^, we find that
$T_c = \Gamma\mu n^{1/2}/\sqrt{s}$, and that the free energy due to globule motion is

$$F_{\mathrm{rand}} = \begin{cases} -sT\left[1 + (T_c/T)^2\right], & T > T_c \\ -2sT_c, & T < T_c. \end{cases} \qquad (8)$$


In Eq. (8), $s$ is the logarithm of the number of possible globule locations. The number
of statistically independent locations is approximately a factor of $n$ smaller, so that $s$ is
appropriately renormalized, $s \simeq \ln(N/n) - 1$. For $T \gg T_c$, randomness of the potential is
irrelevant, and the homopolymer result, $\sim T \ln N$, is recovered.
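
The two-branch free energy of globule motion, $-sT[1 + (T_c/T)^2]$ above $T_c$ and $-2sT_c$ below it, can be written as a small function to confirm that the branches join continuously and that the high-temperature limit reduces to $-sT$. This is a sketch only. The helper `rem_critical_temperature` uses the textbook random energy model relation for $e^s$ independent Gaussian levels of standard deviation `width`, which is an assumption about conventions: the prefactor of $T_c$ in the text depends on how the magnitude of the fluctuations is defined.

```python
import math


def localization_free_energy(T, s, T_c):
    """Free energy of globule motion in a random potential.

    s   : logarithm of the number of independent globule locations
    T_c : localization (freezing) temperature
    """
    if T > T_c:
        return -s * T * (1.0 + (T_c / T) ** 2)
    return -2.0 * s * T_c


def rem_critical_temperature(width, s):
    """Textbook random energy model: T_c = width / sqrt(2 s).

    Assumes 'width' is the standard deviation of the Gaussian levels.
    """
    return width / math.sqrt(2.0 * s)


if __name__ == "__main__":
    s, width = 8.0, 2.0  # hypothetical values
    T_c = rem_critical_temperature(width, s)
    for T in (0.1, T_c, 1.0, 10.0):
        print(T, localization_free_energy(T, s, T_c))
```

In this convention the frozen branch equals $-\mathrm{width}\,\sqrt{2s}$, the expected minimum of $e^s$ independent Gaussian energies.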

The analysis for a single globule is readily generalized for a necklace with several globular
regions. In this case, $e^{s} \simeq n^{-M}(N - n + 1)(N - 2n + 1)\cdots(N - Mn + 1)/M!$, where $M$
is the number of globules. For small globules, $n = O(1)$, the minimum value of $F_{\mathrm{rand}}$ is
obtained for $M = O(N)$, giving $\min F_{\mathrm{rand}} = O(N^{1/2})$. For large globules, $n = O(N)$, $M$
is necessarily $O(1)$, and again $\min F_{\mathrm{rand}} = O(N^{1/2})$. Globule sizes are thus expected to
be distributed broadly, with a preference for a small number of large globules dictated by
surface tension. Due to the $O(N^{1/2})$ stabilization provided by globule localization, a phase
rich in necklace structures covers an appreciable range of force and temperature for $N$ = 10^
(Fig. 4(a)). Indeed, simulated stretching of a short, nearly random heteropolymer involves
partially extended structures |T^. For $N$ = 10^ (Fig. 4(b)), the necklace phase covers a



much smaller region of the phase diagram, due to the extensive free energy cost of mixing 
globule and coil structures. Although necklaces become unstable as $N \to \infty$, our results
suggest that they may be prevalent for macromolecules of finite size. 
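
The counting of globule placements for a necklace with $M$ globules of $n$ monomers each, $e^{s} \simeq n^{-M}(N - n + 1)(N - 2n + 1)\cdots(N - Mn + 1)/M!$, is easiest to evaluate in logarithmic form. The following sketch does so; the function name and the sample values are illustrative assumptions.

```python
import math


def log_placements(N, n, M):
    """s = ln[ n^{-M} (N - n + 1)(N - 2n + 1)...(N - M n + 1) / M! ]."""
    if M < 1 or n < 1 or M * n > N:
        raise ValueError("need M >= 1, n >= 1 and M * n <= N")
    log_product = sum(math.log(N - k * n + 1) for k in range(1, M + 1))
    return log_product - M * math.log(n) - math.lgamma(M + 1)


if __name__ == "__main__":
    N = 1000  # hypothetical chain length
    print(log_placements(N, 100, 1))     # one large globule: s of order 1
    print(log_placements(N, 1, N // 2))  # many small globules: s of order N
```

For a single globule the expression reduces to $\ln[(N - n + 1)/n]$, and for $n = 1$ it reduces to the logarithm of a binomial coefficient, which grows as $N \ln 2$ when $M = N/2$.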

From our results for uncorrelated sequence statistics, we may deduce the basic effects of
introducing correlations. For "blocky" sequences, in which monomers within a correlation
length $\xi$ are likely to be of the same type, fluctuations in local composition are large.
Specifically, $\langle(\delta q)^2\rangle = O(n^0)$ for $n < \xi$. If $\xi$ scales with chain length (or if sequence correlations
decay algebraically with distance along the chain), the resulting free energy contribution
due to localization is $O(N)$, and necklace structures persist as $N \to \infty$. By contrast,
sequences that are anticorrelated on a scale $\xi$ exhibit small fluctuations in local composition.
If $\langle(\delta q)^2\rangle = O(n^{-2})$, fluctuations in solvation energy are $O(1)$, and the stretching behavior
will be homopolymeric. 

The equilibrium behavior of strained polymers described in this Letter is not directly 
related to the intrinsically dynamical, nonequilibrium response measured in experiments. 
Nonetheless, it is reasonable to associate the equilibrium stability of various structures with 
their kinetic accessibility. In particular, when necklaces are stable over a wide range of 
stretching forces, the kinetic transition from compact globule to extended coil is likely not 
sharp. Instead, we expect that the chain visits an ensemble of necklace structures as it passes 
through the transition region. The breadth we have predicted for this ensemble in the case 
of random heteropolymers suggests that the restoring force to applied strain should exhibit 
large fluctuations as the chain is stretched. Such a scenario has in fact been observed for
the protein barnase [18]. The observation that certain protein domains do undergo sharp
stretching transitions is thus an indication of evolutionary design for mechanical strength, 
as is reflected by the roles of such proteins in cell adhesion and muscle elasticity. 
FIGURES

FIG. 1. Possible states of a strained polymer: (a) compact, spherical globule; (b) compact
globule, deformed from a spherical geometry, with extension $2R_\parallel$ in the direction of stretching; (c)
necklace of alternating compact and non-compact regions; and (d) fully non-compact, extensible
coil.

FIG. 2. Phase diagram of a homopolymer subject to an applied force, $f$, on the end-to-end
distance, as estimated by Eq. (3). The boundary between globule and coil phases is drawn for
$N$ = 10^ (dot-dashed line), $N$ = 10^ (dotted line), $N$ = 10^ (dashed line), and $N \to \infty$ (solid line).
These results are obtained for a surface energy density that is comparable to monomer interactions,
$\gamma/\theta \sim 1$. Inset: Schematic picture of the influence of stretching on the free energy of extension
for $T < \theta$. As force is increased ($f_1 < f_2 < f_3 < f_4$), the coil state first becomes metastable
(represented by the local free energy minimum at large $R$) and then stable. Free energy of the
globular state (represented by the local minimum at small $R$) is not sensitive to $f$.

FIG. 3. Motion of a globular region along a random heteropolymer. A sequence of monomer
identities is represented schematically by black and white circles on the chain. Different sequence 
locations, x, of the globule result in different compositions (here, fractions of black and white 
circles) of the globule and of the extended coil regions. Consequently, the ground state energy 
of the globule and solvation energy of the coil depend on x. Because the monomer identities 
are independent random variables, the globule effectively experiences a random, one-dimensional 
potential u for translation along the sequence. 

FIG. 4. Phase diagram of the random copolymer defined by Eq. (4) for (a) $N$ = 10^ and (b)
$N$ = 10^, estimated using mean-field arguments described in the text. Shading denotes regions
dominated by necklace-like structures. Boundaries of these regions are taken to be chain
compositions of 75% globule (lower dashed lines) and 25% globule (upper dashed lines). The solid line
denotes a chain composition of 50% globule, i.e., $N/2$ monomers belong to globular regions, and
$N/2$ belong to coil regions. The chain is assumed to be flexible on the scale of monomer size,
$\rho a^3 \approx 1$.